Tracking the Intangible: Quantifying Effort in NFL Running Backs

Authors

Emily Shteynberg

Luke Snavely

Sheryl Solorzano

Published

July 25, 2025


Introduction

While athletes are often praised for “giving 100%,” what that entails remains poorly defined and difficult to measure objectively. Unlike physical traits such as endurance, strength, and agility, effort is intangible and often conflated with performance outcomes or inferred through subjective observation. Nevertheless, it is widely regarded as a crucial factor in athletic success, influencing both perceived competitiveness and in-game results. Developing a reliable, standardized method to evaluate effort could offer new insight into player capabilities and decision-making.

This study focuses on running backs (RBs) in the NFL, a position characterized by short, high-intensity bursts of movement within clearly structured offensive plays. The nature of their role—frequent accelerations and heavy physical contact—makes RBs a suitable candidate for analyzing exertion in a relatively controlled and consistent context.

Our approach builds on prior work that derives professional soccer players’ theoretical maximum acceleration capacity as a function of running speed in situ.1 To date, no comparable methodology has been applied to American football. By adapting it to RBs, we aim to isolate the measurable, movement-based component of effort, independent of tactical decisions or situational context.

The objective of this study is twofold: (1) to improve estimation of individual acceleration-speed (A-S) profiles using tracking data, and (2) to assess how frequently players operate near or exceed these limits as a proxy for effort.

Data

The data were obtained from the NFL Big Data Bowl 2025, an annual league-sponsored analytics competition that provides game, play, player, and tracking data recorded at 10 frames per second.2 The dataset covers weeks 1-9 of the 2022 NFL season, comprising 136 games.

Pre-processing:

  • Using the tracking data, we first standardized players’ positional coordinates and orientations so that all plays move from left to right, with the bottom-left corner of the field set as the “origin.”

  • After merging the data, we filtered for running plays where the RB was the ball carrier and restricted the sample to RBs with at least 20 rushing attempts, yielding 69 players. Each observation corresponds to a single frame of tracking data.
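The coordinate standardization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the argument names, the "left"/"right" play-direction flag, and the convention that angles are stored in degrees are assumptions about the tracking data layout.

```python
FIELD_LENGTH = 120.0     # yards, including both end zones
FIELD_WIDTH = 160.0 / 3  # yards (53 1/3)


def standardize_frame(x, y, direction_deg, orientation_deg, play_direction):
    """Return (x, y, direction, orientation) so the offense moves left to right.

    Assumes angles are in degrees and `play_direction` is "left" or "right".
    """
    if play_direction == "right":
        # Already moving left to right: only normalize the angles.
        return x, y, direction_deg % 360.0, orientation_deg % 360.0
    # Rotate the play 180 degrees about the center of the field.
    return (
        FIELD_LENGTH - x,
        FIELD_WIDTH - y,
        (direction_deg + 180.0) % 360.0,
        (orientation_deg + 180.0) % 360.0,
    )


# A hypothetical frame from a play that originally moved right to left.
flipped = standardize_frame(30.0, 10.0, 270.0, 90.0, "left")
unchanged = standardize_frame(30.0, 10.0, 270.0, 90.0, "right")
```

Rotating by 180 degrees (rather than mirroring only the x-axis) keeps the left and right sides of the formation consistent from the offense's point of view.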

Feature engineering:

  • The tracking data provided only the magnitude of acceleration. To be able to differentiate between positive acceleration and negative acceleration (deceleration), we derived a variable for directional acceleration by multiplying the given magnitude of acceleration by the cosine of the angle of player motion. Speed and directional acceleration were converted to miles per hour (mph) and miles per hour per second (mph/s), respectively.

  • We also created several variables, all measured at the time of handoff, for our validation model (see Methods) as follows:

    • Distance to nearest defender: compute the Euclidean distance between all defenders’ and RB’s positional coordinates; select the minimum distance

    • Number of defenders within 5 yards of the RB: compute the Euclidean distance between all defenders’ and RB’s positional coordinates; count the number of defenders whose distance is at most 5 yards

    • Number of blockers in front of the RB: the number of blockers (defined as offensive linemen or fullbacks) whose x-coordinates exceed that of the RB
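As a rough sketch of the three handoff-time features listed above, assuming each player is represented by an (x, y) pair in standardized coordinates (yards). The function name, dictionary keys, and example positions are hypothetical.

```python
import math


def handoff_features(rb_xy, defenders_xy, blockers_xy, radius=5.0):
    """Compute the validation-model features for one handoff frame."""
    rb_x, rb_y = rb_xy
    # Euclidean distance from the RB to every defender.
    distances = [math.hypot(dx - rb_x, dy - rb_y) for dx, dy in defenders_xy]
    return {
        "dist_nearest_defender": min(distances),
        "n_defenders_within_radius": sum(d <= radius for d in distances),
        # Plays run left to right, so "in front" means a larger x-coordinate.
        "n_blockers_in_front": sum(bx > rb_x for bx, _ in blockers_xy),
    }


# Hypothetical positions at the time of handoff.
features = handoff_features(
    rb_xy=(30.0, 25.0),
    defenders_xy=[(33.0, 28.0), (40.0, 25.0), (31.0, 21.0)],
    blockers_xy=[(32.0, 24.0), (29.0, 26.0), (34.0, 27.0)],
)
```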

Methods

Motivation

Our initial approach was adapted from the aforementioned prior research building individual A-S profiles of soccer players.1 For each RB, we built a frame-level A-S profile from the maximal acceleration the player generated at each running speed, following the procedure of Morin et al. (2021):

Within a speed interval ranging between 3 mph and the RB’s maximal speed, the two maximal acceleration values attained in each 0.2 mph sub-interval were selected. A first linear regression was fitted to these speed-acceleration points. Outlier points lying outside of the 95% confidence interval around the linear function were removed. A final linear regression was fitted to the remaining points, defining what we refer to as the “maximum acceleration frontier.”
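The procedure above can be sketched in a few lines of NumPy. This is an illustrative approximation rather than the exact published procedure: the 95% band is taken here as ±1.96 residual standard deviations around the first fit, which is an assumption, and the data are synthetic.

```python
import numpy as np


def as_frontier(speed, accel, min_speed=3.0, bin_width=0.2, top_k=2, z=1.96):
    """Fit a linear acceleration-speed frontier; returns (intercept, slope)."""
    speed = np.asarray(speed, dtype=float)
    accel = np.asarray(accel, dtype=float)

    # Discard frames below the minimum speed, then bin the rest by speed.
    mask = speed >= min_speed
    speed, accel = speed[mask], accel[mask]
    bins = np.floor((speed - min_speed) / bin_width).astype(int)

    # Keep the `top_k` largest accelerations in each speed sub-interval.
    keep_idx = []
    for b in np.unique(bins):
        idx = np.flatnonzero(bins == b)
        keep_idx.append(idx[np.argsort(accel[idx])[-top_k:]])
    keep_idx = np.concatenate(keep_idx)
    pts_s, pts_a = speed[keep_idx], accel[keep_idx]

    # First fit, then drop points far from the line (assumed 95% band).
    slope, intercept = np.polyfit(pts_s, pts_a, 1)
    resid = pts_a - (intercept + slope * pts_s)
    inliers = np.abs(resid) <= z * resid.std(ddof=2)

    # Final fit on the remaining points.
    slope, intercept = np.polyfit(pts_s[inliers], pts_a[inliers], 1)
    return float(intercept), float(slope)


# Synthetic frames: true capacity 8 - 0.4 * speed, realized at a random fraction.
rng = np.random.default_rng(0)
speed = rng.uniform(0.0, 20.0, size=5000)
accel = (8.0 - 0.4 * speed) * rng.uniform(0.0, 1.0, size=5000)
intercept, slope = as_frontier(speed, accel)
```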

To compute “effort” from the A-S profile, the regression line was first shifted downward by 0.25 units to include points sufficiently close to the estimated frontier. Effort was then defined as the percentage of a player’s points that fell above this relaxed threshold.
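A minimal sketch of this effort calculation, assuming the frontier is given by an intercept and slope and that the 0.25 shift is in the same units as acceleration (mph/s). The function name and example values are hypothetical.

```python
import numpy as np


def effort_percentage(speed, accel, intercept, slope, relax=0.25):
    """Percentage of frames lying above the frontier shifted down by `relax`."""
    speed = np.asarray(speed, dtype=float)
    accel = np.asarray(accel, dtype=float)
    threshold = intercept + slope * speed - relax
    return float(100.0 * np.mean(accel > threshold))


# Four hypothetical frames against the frontier a = 8 - 0.4 * speed.
speed = np.array([4.0, 8.0, 12.0, 16.0])
accel = np.array([6.3, 3.0, 3.1, 1.5])
effort = effort_percentage(speed, accel, intercept=8.0, slope=-0.4)
# Three of the four frames clear the relaxed threshold, so effort is 75.0.
```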

However, this approach has several limitations. First, it excludes points with speeds below 3 mph from the linear regression, effectively disregarding low-speed frames—even though effort may still be exerted at lower velocities. Second, the linear model extrapolates beyond the observed data, leading to unrealistic estimates of theoretical maximum speeds. Third, players with greater athletic ability may be disproportionately penalized. Because the frontier is estimated relative to each player’s own maximum accelerations, those with higher physical capacity face a stricter threshold, making their efforts appear less frequent with respect to their full potential. Finally, the model does not differentiate between acceleration and deceleration; all changes in velocity are treated equally, despite deceleration typically requiring less effort than acceleration3 and often reflecting tactical or situational constraints outside of the player’s control.

To address these limitations, we define two alternative effort metrics, both based on each RB’s joint distribution of frame-level speed and acceleration:

Metric #1: Quadratic Quantile Regression

To address the shortcomings of the linear regression approach, we developed an effort metric based on quadratic quantile regression. This method was selected for two reasons: first, the relationship between acceleration and speed is empirically nonlinear; second, quantile regression estimates the upper envelope of a player’s acceleration distribution. Ordinary least squares regression estimates the conditional mean, whereas quantile regression targets a chosen conditional quantile, which makes it better suited to identifying players’ maximum acceleration capacities. Specifically, the 0.95 quantile was used to represent each player’s maximum acceleration frontier: it captures some of their highest observed exertions while limiting the influence of outliers that a more extreme quantile (say, 0.99) would allow to distort the frontier.

We fit two separate player-specific quadratic quantile regressions to estimate the 0.95 quantiles (i.e., the 95th percentiles) of acceleration and deceleration, respectively, as functions of speed. Each model was trained on all available frames for a given player. The resulting curves define the individualized maximum acceleration frontiers: one for positive acceleration values and another for negative acceleration values.
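To make the idea concrete, the sketch below fits a quadratic 0.95-quantile curve with NumPy alone, using a majorize-minimize (iteratively reweighted least squares) scheme for the pinball loss as a stand-in for a dedicated quantile-regression routine. It is not the fitting code used in this study. Treating the deceleration frontier as the 0.95 quantile of deceleration magnitude is our reading of the text, and the data are synthetic.

```python
import numpy as np


def design(speed):
    """Design matrix with columns [1, speed, speed^2]."""
    speed = np.asarray(speed, dtype=float)
    return np.column_stack([np.ones_like(speed), speed, speed**2])


def quadratic_quantile_fit(speed, accel, q=0.95, n_iter=300, eps=1e-4):
    """Approximate the q-th conditional quantile of accel as a quadratic in speed."""
    X = design(speed)
    y = np.asarray(accel, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    shift = (2.0 * q - 1.0) * X.sum(axis=0)      # asymmetry term of the pinball loss
    for _ in range(n_iter):
        # Reweight by 1 / |residual| so squared error mimics absolute error.
        w = 1.0 / (np.abs(y - X @ beta) + eps)
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y + shift)
    return beta


# Synthetic player: capacity shrinks with speed; frames use a random share of it.
rng = np.random.default_rng(1)
speed = rng.uniform(0.0, 20.0, size=4000)
cap = 8.0 - 0.015 * speed**2
accel = cap * rng.uniform(-1.0, 1.0, size=4000)

pos, neg = accel > 0, accel < 0
beta_acc = quadratic_quantile_fit(speed[pos], accel[pos], q=0.95)
beta_dec = quadratic_quantile_fit(speed[neg], -accel[neg], q=0.95)  # magnitudes

# Share of frames at or below each frontier; should be close to 0.95.
coverage_acc = float(np.mean(accel[pos] <= design(speed[pos]) @ beta_acc))
coverage_dec = float(np.mean(-accel[neg] <= design(speed[neg]) @ beta_dec))
```

The deceleration frontier in the original sign convention is the negative of the fitted magnitude curve, i.e. `-(design(speed) @ beta_dec)`.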

…..

[Interactive plot]

Metric #2: Quantile Generalized Additive Model

  • This metric quantifies how often a running back comes close to their maximal acceleration capacity.

  • Assumption of this effort metric: high acceleration and/or high speed movements are effortful.

  • For each running back, two regression models with adaptive spline bases were fitted to the 0.98 quantiles of positive and negative acceleration, respectively, both as functions of running speed. A vertical line at the 0.99 quantile of speed was also drawn.

  • This metric was derived by

           [LATEX equation] 

where d_i is the shortest distance from point i to its corresponding quantile regression curve (for positive or negative acceleration, as applicable) or to the vertical line at the 0.99 quantile of speed.

  • For points outside the quantile regression curves and/or beyond the vertical line, the distance d_i was set to 0.
  • For negative acceleration points, the effort score was penalized by a factor of 0.5, as deceleration is deemed less effortful.

[Plot: quantile GAM (qgam) fits for a few example players]

Validation Model:

To validate our effort metrics, we developed an Expected Points Added (EPA) over expected model. EPA is derived from Expected Points (EP), which estimates how many points a team is likely to score on the current drive. EPA is the change in EP from before the play to after it. We chose to model EPA because it values a play by how much it helps a team score. By correlating the model’s residuals with our effort metrics, we could quantify how much effort contributes to EPA beyond what the situation at handoff predicts.

For our model, we only used predictor variables measured at the time of the handoff to the RB. Using full-play metrics would create data leakage, because EPA can only be calculated after a play is over and such metrics would carry information about the outcome being predicted.

We are interested in predicting EPA using the following variables:

  • Game context
    • Home field advantage
    • Quarter
    • Down
    • Score differential
  • Play context
    • Yards to go to a first down
    • Yards to go to the endzone
    • Number of blockers in front of the RB
    • Number of defenders within 5 yards of the RB
    • Offensive formation
    • Run concept
  • RB characteristics
    • Speed
    • Directional acceleration
    • Weight
    • Coordinates of RB on field
  • Nearest defender characteristics
    • Speed
    • Directional acceleration
    • Angle with the RB
    • Distance to RB

After finalizing our variable selection, we used 5-fold cross-validation, assigning all plays from a given game to the same fold to reduce possible dependence between training and test data. We used a random forest model because of its reasonable accuracy and its resistance to overfitting.
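One way to build such game-level folds is sketched below. The function name and the round-robin assignment are illustrative assumptions rather than the exact procedure used here, and the game identifiers are synthetic.

```python
import numpy as np


def assign_game_folds(game_ids, n_folds=5, seed=0):
    """Return a fold index per row so that all rows from one game share a fold."""
    game_ids = np.asarray(game_ids)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.unique(game_ids))
    # Deal the shuffled games into folds round-robin.
    fold_of_game = {g: i % n_folds for i, g in enumerate(shuffled)}
    return np.array([fold_of_game[g] for g in game_ids])


# Hypothetical data: 136 games with 30 rushing plays each.
game_ids = np.repeat(np.arange(136), 30)
folds = assign_game_folds(game_ids)

for k in range(5):
    train_mask, test_mask = folds != k, folds == k
    # No game contributes plays to both the training and the test split.
    assert not set(game_ids[train_mask]) & set(game_ids[test_mask])
    # A model would be fit on the train_mask rows and scored on the test_mask rows.
```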

[Table: Top 10 running backs]

[Figure: Linear regression]

[Figure: Linear quantile regression for acceleration (mph/s) vs. speed (mph)]

Results


[Figure: Scatterplot of the correlation between Metric #1 and Metric #2]

Discussion


  • The A-S profile framework did not translate well to football; it may be better suited to sports in which players are not tackled and can simply run once they receive the ball.
  • NFL plays involve many dependencies among players and situations, which complicates isolating individual effort.
  • Nevertheless, this work is a productive starting point for quantifying effort.
  • Further research should account for additional factors and variables.

Appendix

Trying a different interactive layout

Acknowledgements

Many thanks to Sam Ventura, from the Buffalo Sabres, for his guidance and mentorship throughout the project. We are grateful for the time and dedication Sam has put into this project. We also want to thank Quang Nguyen for his instruction, advice, and support throughout the summer research experience. Quang provided us with the resources necessary to get started with the NFL Big Data Bowl tracking data. Finally, we would like to thank Dr. Ron Yurko and the CMSAC teaching staff for sharing their knowledge and inspiring ideas with us.